
Option to re-display a benchmark file #185

Merged
23 commits merged into vllm-project:main on Jul 10, 2025

Conversation

jaredoconnell (Collaborator)

closes #175

This adds a command to re-display a prior benchmarks file in the CLI.

Before marking this as ready for review, we need to decide what command format we want to use. During the call with Mark, we discussed making this an option within the benchmark command.
Also, let me know if the stripped-down results file is a good one to use. I manually removed data from a large results file.

jaredoconnell marked this pull request as ready for review on June 16, 2025, 17:04

markurtz (Collaborator) left a comment

Thanks, @jaredoconnell; overall, the code looks good. I want to expand the functionality to encompass not only displaying the results but also re-exporting them to a new file if the user desires.

Given that, I'd recommend something along the lines of the following for the CLI:
guidellm benchmark report PATH --output OUTPUT, where --output is optional. If an output file is supplied, we will re-save the report to that file path, using the extension as the file type. The subcommand could also potentially be named export, convert, or anything along those lines.

For the benchmark CLI pathways, it would then look like the following:
guidellm benchmark run ...
guidellm benchmark report ...

And if that ACTION after benchmark is not supplied, we will default to filling it in as run. This way, we namespace all of the commands under benchmark and add flexibility for the future.
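
To make the proposal concrete, here is a sketch of how those invocations could look, assuming the report subcommand and --output flag names suggested above (both were still under discussion at this point, and the file names are illustrative):

    # run a benchmark, explicitly naming the action
    guidellm benchmark run ...

    # re-display a previously saved report in the console
    guidellm benchmark report benchmarks.json

    # re-display and re-save it elsewhere; the extension would select the file type
    guidellm benchmark report benchmarks.json --output benchmarks.csv

    # with a default action, a bare `guidellm benchmark ...` would fall through to run
    guidellm benchmark ...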

jaredoconnell (Collaborator, Author)

This is ready for re-review.

Since the last reviews, there is an option to re-export the benchmarks, and the command format was changed.

The default-command feature isn't supported by Click, so I am using a class to handle the new behavior. I pulled in external code for the default command rather than writing it myself because I found there were a lot of edge cases that broke the functionality.

Should I document this command as Step 6 in the quick start?
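
For readers unfamiliar with the workaround described above, here is a minimal sketch of the general technique: a click.Group subclass whose resolve_command falls back to a default subcommand when the first argument does not name a known one. This is not the PR's DefaultGroupHandler, and it deliberately ignores the edge cases that motivated reusing external code instead.

    import click


    class DefaultGroup(click.Group):
        """Group that dispatches to a default subcommand when the first
        argument is not a recognized subcommand name."""

        def __init__(self, *args, default_cmd="run", **kwargs):
            super().__init__(*args, **kwargs)
            self.default_cmd = default_cmd

        def resolve_command(self, ctx, args):
            try:
                # Normal path: the first argument names a subcommand.
                return super().resolve_command(ctx, args)
            except click.UsageError:
                # Fallback: hand every argument to the default subcommand.
                cmd = self.get_command(ctx, self.default_cmd)
                return self.default_cmd, cmd, args


    @click.group(cls=DefaultGroup, default_cmd="run")
    def benchmark():
        """Benchmark commands."""


    @benchmark.command()
    def run():
        click.echo("running benchmark")


    @benchmark.command("from-file")
    @click.argument("path", type=click.Path(exists=True))
    def from_file(path):
        click.echo(f"re-displaying report from {path}")

With this in place, arguments that do not start with a known subcommand name are re-dispatched to the default command; in practice there are many parsing edge cases (group-level options, unknown flags, help text), which is why reusing a battle-tested implementation makes sense.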

jaredoconnell (Collaborator, Author)

The CI errors appear to be from using an older commit's code. That's very odd.

markurtz requested a review from Copilot on July 8, 2025, 15:19
Copilot AI (Contributor) left a comment

Pull Request Overview

Adds a command and related infrastructure to load and display an existing benchmark report via the CLI.

  • Introduces reimport_benchmarks_report and wires it into a new guidellm benchmark from-file subcommand
  • Adds print_full_report to GenerativeBenchmarksConsole and updates the main benchmark group to use DefaultGroupHandler
  • Includes unit tests/assets for re-display behavior and updates docs to reference the new run subcommand

Reviewed Changes

Copilot reviewed 14 out of 16 changed files in this pull request and generated no comments.

File | Description
src/guidellm/benchmark/entrypoints.py | Added reimport_benchmarks_report and simplified console calls
src/guidellm/benchmark/output.py | Added print_full_report, updated unsupported-extension error
src/guidellm/main.py | Converted benchmark to a subcommand group with run & from-file
src/guidellm/utils/default_group.py | Added DefaultGroupHandler to support default subcommands
tests/unit/entrypoints/.../test_benchmark_from_file_entrypoint.py | New tests for display and re-export
docs/outputs.md | Updated examples to use benchmark run
docs/datasets.md | Updated examples to use benchmark run
README.md | Updated examples to use benchmark run
.pre-commit-config.yaml | Excluded test assets from formatting rules

Comments suppressed due to low confidence (3)

src/guidellm/main.py:36

  • To automatically execute run when users invoke guidellm benchmark without any args, add default_if_no_args=True to the DefaultGroupHandler parameters.
)

src/guidellm/benchmark/entrypoints.py:154

  • [nitpick] This docstring ends abruptly at "Can also specify". Please complete it to explain what additional options are available when re-importing a report.
    existing benchmarks report. Can also specify

docs/outputs.md:3

  • [nitpick] The new benchmark from-file command isn’t mentioned here. Consider adding a section with usage examples for guidellm benchmark from-file to show how to re-display existing reports.
GuideLLM provides flexible options for outputting benchmark results, catering to both console-based summaries and file-based detailed reports. This document outlines the supported output types, their configurations, and how to utilize them effectively.
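
Following up on that last suggestion, a documentation snippet for the new subcommand might look roughly like the following; the --output flag name mirrors the earlier proposal and the file names are illustrative, so treat it as a sketch of the intended usage rather than the PR's exact interface:

    # re-display a previously saved benchmarks report in the console
    guidellm benchmark from-file benchmarks.json

    # re-display it and also re-export it to a new path
    guidellm benchmark from-file benchmarks.json --output regenerated-benchmarks.json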

markurtz merged commit f1dfdc6 into vllm-project:main on Jul 10, 2025
16 checks passed
Development

Successfully merging this pull request may close these issues.

Re-load and re-display a saved benchmark.json report
3 participants